Recursive Whitening Transformation for Speaker Recognition on Language Mismatched Condition

Authors

Suwon Shon

Seongkyu Mun

Hanseok Ko

Abstract

Performance degradation due to channel-domain mismatch has recently been actively addressed in speaker recognition. However, mismatch arising from language has yet to be sufficiently addressed. This paper proposes an approach that employs a recursive whitening transformation to mitigate the language-mismatched condition. The proposed method is based on multiple whitening transformations, which are intended to remove un-whitened residual components in the dataset associated with i-vector length normalization. The experiments were conducted on the Speaker Recognition Evaluation 2016 trials, in which the task is non-English speaker recognition using a development dataset consisting of both a large-scale out-of-domain (English) dataset and an extremely low-quantity in-domain (non-English) dataset. For performance comparison, we develop a state-of-the-art system using a deep neural network and bottleneck features, which is based on a phonetically aware model. The experimental results, along with other prior studies, validate the effectiveness of the proposed method on the language-mismatched condition.
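The abstract does not spell out the procedure, so the following is only a rough sketch of what repeated whitening followed by length normalization could look like. It assumes that each pass re-estimates the mean and covariance from the data it is given (for example, the small in-domain set) and that i-vectors are stored as rows of a NumPy array. The function names `fit_whitening`, `length_normalize`, and `recursive_whitening`, and the default of three passes, are hypothetical and not taken from the paper.

```python
import numpy as np


def fit_whitening(x):
    """Estimate a mean and whitening matrix from the rows of x (assumed helper)."""
    mean = x.mean(axis=0)
    cov = np.cov(x - mean, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    eigval = np.maximum(eigval, 1e-10)  # guard against tiny or negative eigenvalues
    # Scale each eigenvector column by 1/sqrt(eigenvalue), so that
    # (x - mean) @ w has an identity sample covariance.
    w = eigvec / np.sqrt(eigval)
    return mean, w


def length_normalize(x):
    """Project each row onto the unit sphere."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def recursive_whitening(x, n_iter=3):
    """Apply whitening followed by length normalization n_iter times.

    Assumption: statistics are re-estimated from the current vectors at every
    pass, so each pass targets whatever residual structure the previous
    length normalization left behind.
    """
    transforms = []
    for _ in range(n_iter):
        mean, w = fit_whitening(x)
        x = length_normalize((x - mean) @ w)
        transforms.append((mean, w))
    return x, transforms
```

The returned list of `(mean, w)` pairs could then be applied in the same order to enrollment and test vectors, with length normalization after each pair. Whether the original system estimates the statistics this way is not stated in the abstract.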

Similar resources

Recursive Whitening Transformation

Speaker, Accent, and Language Identification Using Multilingual Phone Strings

Currently, approaches based on Gaussian Mixture Models (GMMs) [4] are the most widely and successfully used methods for speaker identification. Although GMMs have been applied successfully to close-speaking microphone scenarios under matched training and testing conditions, their performance degrades dramatically under mismatched conditions. The term “mismatched condition” describes a situation...

I–vector transformation and scaling for PLDA based speaker recognition

This paper proposes a density model transformation for speaker recognition systems based on i–vectors and Probabilistic Linear Discriminant Analysis (PLDA) classification. The PLDA model assumes that the i-vectors are distributed according to the standard normal distribution, whereas it is well known that this is not the case. Experiments have shown that the i–vector are better modeled, for exa...

Ku-ispl Speaker Recognition Systems

Korea University – Intelligent Signal Processing Lab. (KUISPL) developed speaker recognition system for SRE16 fixed training condition. Data for evaluation trials are collected from outside North America, spoken in Tagalog and Cantonese while training data only is spoken English. Thus, main issue for SRE16 is compensating the discrepancy between different languages. As development dataset which...

From Features to Speaker Vectors by means of Restricted Boltzmann Machine Adaptation

Restricted Boltzmann Machines (RBMs) have shown success in different stages of speaker recognition systems. In this paper, we propose a novel framework to produce a vector-based representation for each speaker, which will be referred to as RBMvector. This new approach maps the speaker spectral features to a single fixed-dimensional vector carrying speaker-specific information. In this work, a g...

Publication date: 2017
